PHP Markdown Extra
==================

Version 1.2.3 - Wed 31 Dec 2008

by Michel Fortin
<http://www.michelf.com/>

based on Markdown by John Gruber
<http://daringfireball.net/>


Introduction
------------

This is a special version of PHP Markdown with extra features. See
<http://www.michelf.com/projects/php-markdown/extra/> for details.

Markdown is a text-to-HTML conversion tool for web writers. Markdown
allows you to write using an easy-to-read, easy-to-write plain text
format, then convert it to structurally valid XHTML (or HTML).

"Markdown" is two things: a plain text markup syntax, and a software
tool, written in Perl, that converts the plain text markup to HTML.
PHP Markdown is a port to PHP of the original Markdown program by
John Gruber.

PHP Markdown can work as a plug-in for WordPress and bBlog, as a
modifier for the Smarty templating engine, or as a replacement for
Textile formatting in any software that supports Textile.

Full documentation of Markdown's syntax is available on John's
Markdown page: <http://daringfireball.net/projects/markdown/>


Installation and Requirement
----------------------------

PHP Markdown requires PHP version 4.0.5 or later.


### WordPress ###

PHP Markdown works with [WordPress][wp], version 1.2 or later.

 [wp]: http://wordpress.org/

1.  To use PHP Markdown with WordPress, place the "markdown.php" file
    in the "plugins" folder. This folder is located inside
    "wp-content" at the root of your site:

        (site home)/wp-content/plugins/

2.  Activate the plugin with the administrative interface of
    WordPress. In the "Plugins" section you will now find Markdown.
    To activate the plugin, click on the "Activate" button on the
    same line as Markdown. Your entries will now be formatted by
    PHP Markdown.

3.  To post Markdown content, you'll first have to disable the
    "visual" editor in the User section of WordPress.

You can configure PHP Markdown to not apply to the comments on your
WordPress weblog. See the "Configuration" section below.

It is not possible at this time to apply a different set of
filters to different entries. All your entries will be formatted by
PHP Markdown. This is a limitation of WordPress. If your old entries
are written in HTML (as opposed to another formatting syntax, like
Textile), they'll probably stay fine after installing Markdown.


### bBlog ###

PHP Markdown also works with [bBlog][bb].

 [bb]: http://www.bblog.com/

To use PHP Markdown with bBlog, rename "markdown.php" to
"modifier.markdown.php" and place the file in the "bBlog_plugins"
folder. This folder is located inside the "bblog" directory of
your site, like this:

    (site home)/bblog/bBlog_plugins/modifier.markdown.php

Select "Markdown" as the "Entry Modifier" when you post a new
entry. This setting will only apply to the entry you are editing.


### Replacing Textile in TextPattern ###

[TextPattern][tp] uses [Textile][tx] to format your text. You can
replace Textile with Markdown in TextPattern without having to change
any code by using the *Textile Compatibility Mode*. This may work
with other software that expects Textile too.

 [tx]: http://www.textism.com/tools/textile/
 [tp]: http://www.textpattern.com/

1.  Rename the "markdown.php" file to "classTextile.php". This will
    make PHP Markdown behave as if it were the actual Textile parser.

2.  Replace the "classTextile.php" file TextPattern installed in your
    web directory. It can be found in the "lib" directory:

        (site home)/textpattern/lib/

Contrary to Textile, Markdown does not convert quotes to curly ones
and does not convert multiple hyphens (`--` and `---`) into en- and
em-dashes.
If you use PHP Markdown in Textile Compatibility Mode, you
can solve this problem by installing the "smartypants.php" file from
[PHP SmartyPants][psp] beside the "classTextile.php" file. The Textile
Compatibility Mode function will use SmartyPants automatically without
further modification.

 [psp]: http://www.michelf.com/projects/php-smartypants/


### In Your Own Programs ###

You can use PHP Markdown easily in your current PHP program. Simply
include the file and then call the Markdown function on the text you
want to convert:

    include_once "markdown.php";
    $my_html = Markdown($my_text);

If you wish to use PHP Markdown with another text filter function
built to parse HTML, you should filter the text *after* the Markdown
function call. This is an example with [PHP SmartyPants][psp]:

    $my_html = SmartyPants(Markdown($my_text));


### With Smarty ###

If your program uses the [Smarty][sm] template engine, PHP Markdown
can now be used as a modifier for your templates. Rename "markdown.php"
to "modifier.markdown.php" and put it in your Smarty plugins folder.

 [sm]: http://smarty.php.net/

If you are using MovableType 3.1 or later, the Smarty plugin folder is
located at `(MT CGI root)/php/extlib/smarty/plugins`. This will allow
Markdown to work on dynamic pages.


### Updating Markdown in Other Programs ###

Many web applications now ship with PHP Markdown, or have plugins to
perform the conversion to HTML. You can update PHP Markdown -- or
replace it with PHP Markdown Extra -- in many of these programs by
swapping the old "markdown.php" file for the new one.

Here is a short non-exhaustive list of some programs and where they
hide the "markdown.php" file.

| Program   | Path to Markdown
| -------   | ----------------
| [Pivot][] | `(site home)/pivot/includes/markdown/`

If you're unsure whether you can do this with your application, ask the
developer, or wait for the developer to update his application or
plugin with the new version of PHP Markdown.

 [Pivot]: http://pivotlog.net/


Configuration
-------------

By default, PHP Markdown produces XHTML output for tags with empty
elements. E.g.:

    <br />

Markdown can be configured to produce HTML-style tags; e.g.:

    <br>

To do this, you must edit the "MARKDOWN_EMPTY_ELEMENT_SUFFIX"
definition below the "Global default settings" header at the start of
the "markdown.php" file.


### WordPress-Specific Settings ###

By default, the Markdown plugin applies to both posts and comments on
your WordPress weblog. To deactivate one or the other, edit the
`MARKDOWN_WP_POSTS` or `MARKDOWN_WP_COMMENTS` definitions under the
"WordPress settings" header at the start of the "markdown.php" file.


Bugs
----

To file bug reports please send email to:
<michel.fortin@michelf.com>

Please include with your report: (1) the example input; (2) the output you
expected; (3) the output PHP Markdown actually produced.


Version History
---------------

Extra 1.2.3 (31 Dec 2008):

*   In WordPress pages featuring more than one post, footnote id prefixes are
    now automatically applied with the current post ID to avoid clashes
    between footnotes belonging to different posts.

*   Fix for a bug introduced in Extra 1.2 where block-level HTML tags were
    not detected correctly, resulting in the addition of erroneous `<p>` tags
    and interpretation of their content as Markdown-formatted instead of
    HTML-formatted.


Extra 1.2.2 (21 Jun 2008):

*   Fixed a problem where abbreviation definitions, footnote
    definitions and link references were stripped inside
    fenced code blocks.

*   Fixed a bug where characters such as `"` in abbreviation
    definitions weren't properly encoded to HTML entities.

*   Fixed a bug where double quotes `"` were not correctly encoded
    as HTML entities when used inside a footnote reference id.


1.0.1m (21 Jun 2008):

*   Lists can now have empty items.

*   Rewrote the emphasis and strong emphasis parser to fix some issues
    with oddly placed and overlong markers.


Extra 1.2.1 (27 May 2008):

*   Fixed a problem where Markdown headers and horizontal rules were
    transformed into their HTML equivalent inside fenced code blocks.


Extra 1.2 (11 May 2008):

*   Added fenced code block syntax which doesn't require indentation
    and can start and end with blank lines. A fenced code block
    starts with a line of consecutive tildes (~) and ends on the
    next line with the same number of consecutive tildes. Here's an
    example:

        ~~~~~~~~~~~~
        Hello World!
        ~~~~~~~~~~~~

*   Rewrote parts of the HTML block parser to better accommodate
    fenced code blocks.

*   Footnotes may now be referenced from within another footnote.

*   Added programmatically-settable parser property `predef_attr` for
    predefined attribute definitions.

*   Fixed an issue where an indented code block preceded by a blank
    line containing some other whitespace would confuse the HTML
    block parser into creating an HTML block when it should have
    been code.


1.0.1l (11 May 2008):

*   Now removing the UTF-8 BOM at the start of a document, if present.

*   Now accepting capitalized URI schemes (such as HTTP:) in automatic
    links, such as `<HTTP://EXAMPLE.COM/>`.

*   Fixed a problem where `<hr@example.com>` was seen as a horizontal
    rule instead of an automatic link.

*   Fixed an issue where some characters in Markdown-generated HTML
    attributes weren't properly escaped with entities.

*   Fix for code blocks as first element of a list item. Previously,
    this didn't create any code block for item 2:

        *   Item 1 (regular paragraph)

        *       Item 2 (code block)

*   A code block starting on the second line of a document wasn't seen
    as a code block. This has been fixed.

*   Added programmatically-settable parser properties `predef_urls` and
    `predef_titles` for predefined URLs and titles for reference-style
    links. To use this, your PHP code must call the parser this way:

        $parser = new Markdown_Parser;
        $parser->predef_urls = array('linkref' => 'http://example.com');
        $html = $parser->transform($text);

    You can then use the URL as a normal link reference:

        [my link][linkref]
        [my link][linkRef]

    Reference names in the parser properties *must* be lowercase.
    Reference names in the Markdown source may have any case.

*   Added `setup` and `teardown` methods which can be used by subclassers
    as hook points to arrange the state of some parser variables before and
    after parsing.


Extra 1.1.7 (26 Sep 2007):

1.0.1k (26 Sep 2007):

*   Fixed a problem introduced in 1.0.1i where three or more identical
    uppercase letters, as well as a few other symbols, would trigger
    a horizontal line.


Extra 1.1.6 (4 Sep 2007):

1.0.1j (4 Sep 2007):

*   Fixed a problem introduced in 1.0.1i where the closing `code` and
    `pre` tags at the end of a code block were appearing in the wrong
    order.

*   Overriding configuration settings by defining constants from an
    external file before markdown.php is included is now possible without
    producing a PHP warning.
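
As a rough sketch of the constant override described in the 1.0.1j notes
above, a calling script might look like the following. It assumes that
"markdown.php" is reachable from the include path and reuses the
`MARKDOWN_EMPTY_ELEMENT_SUFFIX` setting from the Configuration section; the
sample text and the `$my_text` variable are only illustrative.

```php
<?php
// Sketch only: assumes "markdown.php" can be found in the include path.

// Define the setting *before* loading the library, so that the default
// value in markdown.php does not replace it.
define('MARKDOWN_EMPTY_ELEMENT_SUFFIX', '>');  // HTML-style suffix, as in <br>

include_once "markdown.php";

// Illustrative input: two trailing spaces ask Markdown for a line break,
// which is rendered as an empty element.
$my_text = "First line  \nSecond line";
echo Markdown($my_text);
```

Keeping the override in the calling script, rather than editing
"markdown.php" itself, means the setting should survive when the file is
swapped for a newer version.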


Extra 1.1.5 (31 Aug 2007):

1.0.1i (31 Aug 2007):

*   Fixed a problem where an escaped backslash before a code span
    would prevent the code span from being created. This should now
    work as expected:

        Literal backslash: \\`code span`

*   Overall speed improvements, especially with long documents.


Extra 1.1.4 (3 Aug 2007):

1.0.1h (3 Aug 2007):

*   Added two properties (`no_markup` and `no_entities`) to the parser
    allowing HTML tags and entities to be disabled.

*   Fix for a problem introduced in 1.0.1g where posting comments in
    WordPress would trigger PHP warnings and cause some markup to be
    incorrectly filtered by the kses filter in WordPress.


Extra 1.1.3 (3 Jul 2007):

*   Fixed a performance problem when parsing some invalid HTML as an HTML
    block which was resulting in too much recursion and a segmentation fault
    for long documents.

*   The markdown="" attribute now accepts unquoted values.

*   Fixed an issue where underscore-emphasis didn't work when applied on the
    first or the last word of an element having the markdown="1" or
    markdown="span" attribute set unless there was some surrounding whitespace.
    This didn't work:

        <p markdown="1">_Hello_ _world_</p>

    Now it does produce emphasis as expected.

*   Fixed an issue preventing footnotes from working when the parser's
    footnote id prefix variable (fn_id_prefix) is not empty.

*   Fixed a performance problem where the regular expression for strong
    emphasis introduced in version 1.1 could sometimes take long to process,
    give slightly wrong results, and in some circumstances could remove
    entirely the content for a whole paragraph.

*   Fixed an issue where abbreviation tags could be incorrectly added
    inside URLs and titles of links.

*   Placing footnote markers inside a link, resulting in two nested links, is
    no longer allowed.


1.0.1g (3 Jul 2007):

*   Fix for PHP 5 compiled without the mbstring module. Previous fix to
    calculate the length of UTF-8 strings in `detab` when `mb_strlen` is
    not available was only working with PHP 4.

*   Fixed a problem with WordPress 2.x where full-content posts in RSS feeds
    were not processed correctly by Markdown.

*   Now supports URLs containing literal parentheses for inline links
    and images, such as:

        [WIMP](http://en.wikipedia.org/wiki/WIMP_(computing))

    Such parentheses may be arbitrarily nested, but must be
    balanced. Unbalanced parentheses are allowed, however, when
    escaped or when the URL is enclosed in angle brackets `<>`.

*   Fixed a performance problem where the regular expression for strong
    emphasis introduced in version 1.0.1d could sometimes take long to
    process, give slightly wrong results, and in some circumstances could
    remove entirely the content for a whole paragraph.

*   Some change in version 1.0.1d made possible the incorrect nesting of
    anchors within each other. This is now fixed.

*   Fixed a rare issue where certain MD5 hashes in the content could
    be changed to their corresponding text. For instance, this:

        The MD5 value for "+" is "26b17225b626fb9238849fd60eabdf60".

    was incorrectly changed to this in previous versions of PHP Markdown:

        <p>The MD5 value for "+" is "+".</p>

*   Now convert escaped characters to their numeric character
    references equivalent.

    This fixes an integration issue with SmartyPants and backslash escapes.
    Since Markdown and SmartyPants have some escapable characters in common,
    it was sometimes necessary to escape them twice.
    Previously, two
    backslashes were sometimes required to prevent Markdown from "eating" the
    backslash before SmartyPants sees it:

        Here are two hyphens: \\--

    Now, only one backslash will do:

        Here are two hyphens: \--


Extra 1.1.2 (7 Feb 2007):

*   Fixed an issue where headers preceded too closely by a paragraph
    (with no blank line separating them) were put inside the paragraph.

*   Added the missing TextileRestricted method that was added to regular
    PHP Markdown since 1.0.1d but which I forgot to add to Extra.


1.0.1f (7 Feb 2007):

*   Fixed an issue with WordPress where manually-entered excerpts, but
    not the auto-generated ones, would contain nested paragraphs.

*   Fixed an issue introduced in 1.0.1d where headers and blockquotes
    preceded too closely by a paragraph (not separated by a blank line)
    were incorrectly put inside the paragraph.

*   Fixed an issue introduced in 1.0.1d in the tokenizeHTML method where
    two consecutive code spans would be merged into one when together they
    form a valid tag in a multiline paragraph.

*   Fixed a long-prevailing issue where blank lines in code blocks would
    be doubled when the code block is in a list item.

    This was due to the list processing functions relying on artificially
    doubled blank lines to correctly determine when list items should
    contain block-level content. The list item processing model was thus
    changed to avoid the need for double blank lines.

*   Fixed an issue with `<% asp-style %>` instructions used as inline
    content where the opening `<` was encoded as `&lt;`.

*   Fixed a parse error occurring when PHP is configured to accept
    ASP-style delimiters as boundaries for PHP scripts.

*   Fixed a bug introduced in 1.0.1d where underscores in automatic links
    got swapped with emphasis tags.


Extra 1.1.1 (28 Dec 2006):

*   Fixed a problem where whitespace at the end of the line of an atx-style
    header would cause trailing `#` to appear as part of the header's content.
    This was caused by a small error in the regex that handles the definition
    for the id attribute in PHP Markdown Extra.

*   Fixed a problem where empty abbreviation definitions would eat the
    following line as their definition.

*   Fixed an issue with calling the Markdown parser repeatedly with text
    containing footnotes. The footnote hashes were not reinitialized properly.


1.0.1e (28 Dec 2006):

*   Added support for internationalized domain names for email addresses in
    automatic links. Improved the speed at which email addresses are converted
    to entities. Thanks to Milian Wolff for his optimisations.

*   Made deterministic the conversion to entities of email addresses in
    automatic links. This means that a given email address will always be
    encoded the same way.

*   PHP Markdown will now use its own function to calculate the length of a
    UTF-8 string in `detab` when `mb_strlen` is not available instead of
    giving a fatal error.


Extra 1.1 (1 Dec 2006):

*   Added a syntax for footnotes.

*   Added an experimental syntax to define abbreviations.


1.0.1d (1 Dec 2006):

*   Fixed a bug where inline images always had an empty title attribute. The
    title attribute is now present only when explicitly defined.

*   Link reference definitions can now have an empty title. Previously, if the
    title was defined but left empty, the link definition was ignored. This can
    be useful if you want an empty title attribute in images to hide the
    tooltip in Internet Explorer.

*   Made `detab` aware of UTF-8 characters. UTF-8 multi-byte sequences are now
    correctly mapped to one character instead of the number of bytes.

*   Fixed a small bug with WordPress where WordPress' default filter `wpautop`
    was not properly deactivated on comment text, resulting in hard line breaks
    where Markdown does not prescribe them.

*   Added a `TextileRestricted` method to the Textile compatibility mode. There
    is no restriction however, as Markdown does not have a restricted mode at
    this point. This should make PHP Markdown work again in the latest
    versions of TextPattern.

*   Converted PHP Markdown to an object-oriented design.

*   Changed span and block gamut methods so that they loop over a
    customizable list of methods. This makes subclassing the parser a more
    interesting option for creating syntax extensions.

*   Also added a "document" gamut loop which can be used to hook document-level
    methods (like for stripping link definitions).

*   Changed all methods which were inserting HTML code so that they now return
    a hashed representation of the code. New methods `hashSpan` and `hashBlock`
    are used to hash respectively span- and block-level generated content. This
    has a couple of significant effects:

    1.  It prevents invalid nesting of Markdown-generated elements which
        could occur with constructs like `*something [link*][1]`.
    2.  It prevents problems occurring with deeply nested lists on which
        paragraphs were ill-formed.
    3.  It removes the need to call `hashHTMLBlocks` twice during the
        block gamut.

    Hashes are turned back to HTML prior to output.

*   Made the block-level HTML parser smarter using a specially-crafted regular
    expression capable of handling nested tags.

*   Solved backtick issues in tag attributes by rewriting the HTML tokenizer to
    be aware of code spans.
    All these lines should work correctly now:

        <span attr='`ticks`'>bar</span>
        <span attr='``double ticks``'>bar</span>
        `<test a="` content of attribute `">`

*   Changed the parsing of HTML comments to match simply from `<!--` to `-->`
    instead of using the more complicated SGML-style rule with paired `--`.
    This is how most browsers parse comments and how XML defines them too.

*   `<address>` has been added to the list of block-level elements and is now
    treated as an HTML block instead of being wrapped within paragraph tags.

*   Now only trim trailing newlines from code blocks, instead of trimming
    all trailing whitespace characters.

*   Fixed bug where this:

        [text](http://m.com "title" )

    wasn't working as expected, because the parser wasn't allowing for spaces
    before the closing paren.

*   Filthy hack to support markdown='1' in div tags.

*   _DoAutoLinks() now supports the 'dict://' URL scheme.

*   PHP- and ASP-style processor instructions are now protected as
    raw HTML blocks.

        <? ... ?>
        <% ... %>

*   Fix for escaped backticks still triggering code spans:

        There are two raw backticks here: \` and here: \`, not a code span


Extra 1.0 - 5 September 2005

*   Added support for setting the id attributes for headers like this:

        Header 1  {#header1}
        ========

        ## Header 2 ##  {#header2}

    This only works for headers for now.

*   Tables will now work correctly as the first element of a definition
    list. For example, this input:

        Term

        :   Header  | Header
            ------- | -------
            Cell    | Cell

    used to produce no definition list and a table where the first
    header was named ": Header". This is now fixed.

*   Fix for a problem where a paragraph following a table was not
    placed between `<p>` tags.


Extra 1.0b4 - 1 August 2005

*   Fixed some issues where whitespace around HTML blocks was triggering
    empty paragraph tags.

*   Fixed an HTML block parsing issue that would cause a block element
    following a code span or block with unmatched opening bracket to be
    placed inside a paragraph.

*   Removed some PHP notices that could appear when parsing definition
    lists and tables with PHP notice reporting flag set.


Extra 1.0b3 - 29 July 2005

*   Definition lists now require a blank line before each term. Solves
    an ambiguity where the last line of lazy-indented definitions could
    be mistaken by PHP Markdown as a new term in the list.

*   Definition lists now support multiple terms per definition.

*   Some special tags were replaced in the output by their md5 hash
    key. Things such as this now work as expected:

        ## Header <?php echo $number ?> ##


Extra 1.0b2 - 26 July 2005

*   Definition lists can now take two or more definitions for one term.
    This should have been the case before, but a bug prevented this
    from working right.

*   Fixed a problem where single-column tables with a pipe only at the
    end were not parsed as tables. Here is such a table:

        | header
        | ------
        | cell

*   Fixed problems with empty cells in the first column of a table with
    no leading pipe, like this one:

        header | header
        ------ | ------
               | cell

*   Code spans containing pipes did not work within a table. This is now
    fixed by parsing code spans before splitting rows into cells.

*   Added the pipe character to the backslash escape character lists.


Extra 1.0b1 (25 Jun 2005):

*   First public release of PHP Markdown Extra.


Copyright and License
---------------------

Copyright (c) 2004-2005 Michel Fortin
<http://www.michelf.com/>
All rights reserved.

Based on Markdown
Copyright (c) 2003-2005 John Gruber
<http://daringfireball.net/>
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:

*   Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.

*   Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the
    distribution.

*   Neither the name "Markdown" nor the names of its contributors may
    be used to endorse or promote products derived from this software
    without specific prior written permission.

This software is provided by the copyright holders and contributors "as
is" and any express or implied warranties, including, but not limited
to, the implied warranties of merchantability and fitness for a
particular purpose are disclaimed. In no event shall the copyright owner
or contributors be liable for any direct, indirect, incidental, special,
exemplary, or consequential damages (including, but not limited to,
procurement of substitute goods or services; loss of use, data, or
profits; or business interruption) however caused and on any theory of
liability, whether in contract, strict liability, or tort (including
negligence or otherwise) arising in any way out of the use of this
software, even if advised of the possibility of such damage.